{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Parameter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
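    "## Overview\n",
    "`Parameter` is a variable tensor that represents a parameter to be updated while the network is trained. This chapter describes how to initialize a `Parameter`, how to use its attributes and methods, and introduces `ParameterTuple`.\n",
    "\n",
    "> This document applies to CPU, GPU, and Ascend environments."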
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Initialization"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```python\n",
    "mindspore.Parameter(default_input, name, requires_grad=True, layerwise_parallel=False)\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
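    "Initializes a `Parameter` object. The input data can be one of four types: `Tensor`, `Initializer`, `int`, or `float`.\n",
    "\n",
    "An `Initializer` object can be generated by calling the `initializer` interface.\n",
    "\n",
    "When the network uses a semi-automatic or fully automatic parallel strategy and a `Parameter` is initialized with an `Initializer`, the `Parameter` stores a `MetaTensor` instead of a `Tensor`.\n",
    "\n",
    "Unlike a `Tensor`, a `MetaTensor` stores only the shape and data type of the tensor, not the actual data, so it does not occupy memory for the data. Call the `init_data` interface to convert the `MetaTensor` stored in a `Parameter` into a `Tensor`.\n",
    "\n",
    "Each `Parameter` can be given a name, which makes later operations and updates easier.\n",
    "\n",
    "If the parameter needs to be updated, set `requires_grad` to `True`.\n",
    "\n",
    "When `layerwise_parallel` (hybrid parallelism) is set to `True`, the parameter is filtered out during parameter broadcast and parameter gradient aggregation.\n",
    "\n",
    "For the configuration of distributed parallelism, see https://www.mindspore.cn/doc/programming_guide/zh-CN/master/auto_parallel.html .\n",
    "\n",
    "The following example constructs `Parameter` objects from three different data types. All three need to be updated, and none of them uses layerwise parallelism:"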
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-01-05T08:01:34.482321Z",
     "start_time": "2021-01-05T08:01:33.740214Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Parameter (name=x) \n",
      "\n",
      " Parameter (name=y) \n",
      "\n",
      " Parameter (name=z)\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "from mindspore import Tensor, Parameter\n",
    "from mindspore import dtype as mstype\n",
    "from mindspore.common.initializer import initializer\n",
    "\n",
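    "# Construct a Parameter from a Tensor.\n",
    "x = Parameter(default_input=Tensor(np.arange(2*3).reshape((2, 3))), name=\"x\")\n",
    "# Construct a Parameter from an Initializer.\n",
    "y = Parameter(default_input=initializer('ones', [1, 2, 3], mstype.float32), name='y')\n",
    "# Construct a Parameter from a float.\n",
    "z = Parameter(default_input=2.0, name='z')\n",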
    "\n",
    "print(x, \"\\n\\n\", y, \"\\n\\n\", z)"
   ]
  },
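  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The difference between a `MetaTensor` and a `Tensor` described above can be pictured with a small pure-Python sketch. The class `LazyOnes` and its `materialize` method are hypothetical names invented for this illustration; they are not MindSpore APIs and do not reflect how MindSpore implements `MetaTensor` or `init_data`. The sketch only shows the idea of recording shape and type first and creating the data later.\n",
    "\n",
    "```python\n",
    "# Hypothetical illustration only; these names are not MindSpore APIs.\n",
    "class LazyOnes:\n",
    "    '''Stands in for a MetaTensor: records shape and dtype, holds no data.'''\n",
    "\n",
    "    def __init__(self, shape, dtype):\n",
    "        self.shape = tuple(shape)\n",
    "        self.dtype = dtype\n",
    "\n",
    "    def materialize(self):\n",
    "        '''Build the actual data (a nested list of ones) on demand.'''\n",
    "        def build(dims):\n",
    "            if not dims:\n",
    "                return self.dtype(1)\n",
    "            return [build(dims[1:]) for _ in range(dims[0])]\n",
    "        return build(self.shape)\n",
    "\n",
    "\n",
    "meta = LazyOnes([1, 2, 3], float)\n",
    "print(meta.shape)          # the shape is known before any data exists\n",
    "data = meta.materialize()  # the data is created only at this point\n",
    "print(data)\n",
    "```\n",
    "\n",
    "Deferring the data in this way is what allows a parallel strategy to decide how a parameter is sliced before any memory is spent on its values."
   ]
  },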
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
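    "## Attributes\n",
    "* `inited_param`: returns the `Parameter` that holds the actual data. If the `Parameter` originally stored a `MetaTensor`, it is converted to a `Tensor`.\n",
    "\n",
    "* `name`: the name specified when the `Parameter` was instantiated.\n",
    "\n",
    "* `sliced`: used in automatic parallel scenarios; indicates whether the data stored in the `Parameter` is already sliced.\n",
    "\n",
    "    If it is, the data is not sliced again. If it is not, the network parallel strategy determines whether the data needs to be sliced.\n",
    "\n",
    "* `is_init`: the initialization status of the `Parameter`. On the GE backend, a `Parameter` needs an `init graph` to synchronize data from the host to the device, and this flag indicates whether the data has been synchronized to the device. The flag takes effect only on the GE backend; on other backends it is set to `False`.\n",
    "\n",
    "* `layerwise_parallel`: whether the `Parameter` supports layerwise parallelism. If it does, the parameter is not broadcast and its gradients are not aggregated; otherwise both are required.\n",
    "\n",
    "* `requires_grad`: whether the parameter gradient needs to be computed. The gradient is computed if the parameter is to be trained; otherwise it is not.\n",
    "\n",
    "* `data`: the `Parameter` itself.\n",
    "\n",
    "The following example initializes a `Parameter` from a `Tensor` and reads its attributes:"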
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-01-05T08:01:34.488932Z",
     "start_time": "2021-01-05T08:01:34.483860Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "name:  x \n",
      " sliced:  False \n",
      " is_init:  False \n",
      " inited_param:  None \n",
      " requires_grad:  True \n",
      " layerwise_parallel:  False \n",
      " data:  Parameter (name=x)\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "\n",
    "from mindspore import Tensor, Parameter\n",
    "\n",
    "x = Parameter(default_input=Tensor(np.arange(2*3).reshape((2, 3))), name=\"x\")\n",
    "\n",
    "print(\"name: \", x.name, \"\\n\",\n",
    "      \"sliced: \", x.sliced, \"\\n\",\n",
    "      \"is_init: \", x.is_init, \"\\n\",\n",
    "      \"inited_param: \", x.inited_param, \"\\n\",\n",
    "      \"requires_grad: \", x.requires_grad, \"\\n\",\n",
    "      \"layerwise_parallel: \", x.layerwise_parallel, \"\\n\",\n",
    "      \"data: \", x.data)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
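    "## Methods\n",
    "* `init_data`: when the network uses a semi-automatic or fully automatic parallel strategy and the `Parameter` was initialized with an `Initializer`, call this interface to convert the data stored in the `Parameter` into a `Tensor`.\n",
    "\n",
    "* `set_data`: sets the data stored in the `Parameter`. A `Tensor`, `Initializer`, `int`, or `float` can be passed in. When the `slice_shape` argument is set to `True`, the shape of the `Parameter` can be changed; otherwise the shape of the new data must match the original shape of the `Parameter`.\n",
    "\n",
    "* `set_param_ps`: controls whether the trainable parameter is trained through the [Parameter Server](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/apply_parameter_server_training.html).\n",
    "\n",
    "* `clone`: clones the `Parameter`. The clone should be given its own name after cloning.\n",
    "\n",
    "The following example initializes a `Parameter` from an `Initializer` and calls its methods:"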
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-01-05T08:02:49.825170Z",
     "start_time": "2021-01-05T08:02:49.818485Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Parameter (name=Parameter)\n",
      "Parameter (name=x_clone)\n",
      "Parameter (name=Parameter)\n",
      "Parameter (name=Parameter)\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "\n",
    "from mindspore import Tensor, Parameter\n",
    "from mindspore import dtype as mstype\n",
    "from mindspore.common.initializer import initializer\n",
    "\n",
    "x = Parameter(default_input=initializer('ones', [1, 2, 3], mstype.float32))\n",
    "\n",
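    "print(x)\n",
    "# Clone the Parameter, then give the clone its own name.\n",
    "x_clone = x.clone()\n",
    "x_clone.name = \"x_clone\"\n",
    "print(x_clone)\n",
    "\n",
    "# init_data converts the stored data to a Tensor; set_data then replaces it\n",
    "# with a Tensor of the same shape (slice_shape is left at its default).\n",
    "print(x.init_data())\n",
    "print(x.set_data(data=Tensor(np.arange(2*3).reshape((1, 2, 3)))))"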
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## ParameterTuple\n",
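    "`ParameterTuple` inherits from `tuple` and stores multiple `Parameter` objects. It is constructed through `__new__(cls, iterable)` from an iterable of `Parameter` objects, and provides a `clone` interface for cloning.\n",
    "\n",
    "The following example constructs a `ParameterTuple` object and clones it:"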
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2021-01-05T08:02:53.034919Z",
     "start_time": "2021-01-05T08:02:53.026182Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(Parameter (name=x), Parameter (name=y), Parameter (name=z)) \n",
      "\n",
      "(Parameter (name=params_copy.x), Parameter (name=params_copy.y), Parameter (name=params_copy.z))\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "from mindspore import Tensor, Parameter, ParameterTuple\n",
    "from mindspore import dtype as mstype\n",
    "from mindspore.common.initializer import initializer\n",
    "\n",
    "x = Parameter(default_input=Tensor(np.arange(2*3).reshape((2, 3))), name=\"x\")\n",
    "y = Parameter(default_input=initializer('ones', [1, 2, 3], mstype.float32), name='y')\n",
    "z = Parameter(default_input=2.0, name='z')\n",
    "params = ParameterTuple((x, y, z))\n",
    "params_copy = params.clone(\"params_copy\")\n",
    "print(params, \"\\n\")\n",
    "print(params_copy)"
   ]
  }
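  ,
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The construction pattern described in this section, a `tuple` subclass built through `__new__(cls, iterable)` with a prefix-based `clone`, can be sketched in plain Python. `NamedParam` and `ParamTuple` are hypothetical stand-ins invented for this illustration, and the type check in `__new__` is an assumption; this is not the MindSpore implementation. The naming scheme follows the output shown above, where the prefix and a dot are prepended to each name.\n",
    "\n",
    "```python\n",
    "# Hypothetical stand-ins; not the MindSpore implementation.\n",
    "class NamedParam:\n",
    "    def __init__(self, name, value):\n",
    "        self.name = name\n",
    "        self.value = value\n",
    "\n",
    "    def __repr__(self):\n",
    "        return 'NamedParam (name=' + self.name + ')'\n",
    "\n",
    "\n",
    "class ParamTuple(tuple):\n",
    "    '''A tuple that only accepts NamedParam items (assumed check).'''\n",
    "\n",
    "    def __new__(cls, iterable):\n",
    "        items = tuple(iterable)\n",
    "        for item in items:\n",
    "            if not isinstance(item, NamedParam):\n",
    "                raise TypeError('ParamTuple items must be NamedParam')\n",
    "        return super().__new__(cls, items)\n",
    "\n",
    "    def clone(self, prefix):\n",
    "        '''Copy every item, prepending the prefix and a dot to its name.'''\n",
    "        return ParamTuple(\n",
    "            NamedParam(prefix + '.' + p.name, p.value) for p in self\n",
    "        )\n",
    "\n",
    "\n",
    "params = ParamTuple((NamedParam('x', 1.0), NamedParam('y', 2.0)))\n",
    "params_copy = params.clone('params_copy')\n",
    "print(params_copy)\n",
    "```\n",
    "\n",
    "A tuple is immutable, so the items have to be supplied in `__new__` rather than `__init__`, which is why the constructor takes this form."
   ]
  }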
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
